Advances: Learning Causal Networks

CONTENTS

  • Causal Networks
  • Learning Acausal Networks
  • Learning Influence Diagrams
  • Learning Causal-Network Parameters
  • Learning Causal-Network Structure
  • Learning Hidden Variables
  • Learning More General Causal Models

Author

  • David Heckerman
Abstract

Bayesian methods have been developed for learning Bayesian networks from data. Most of this work has concentrated on Bayesian networks interpreted as a representation of probabilistic conditional independence, without considering causation. Other researchers have shown that a causal interpretation can be important, because it allows us to predict the effects of interventions in a domain. In this chapter, we extend Bayesian methods for learning acausal Bayesian networks to causal Bayesian networks.

There has been a great deal of recent interest in Bayesian methods for learning Bayesian networks from data (Spiegelhalter & Lauritzen, 1990; Cooper & Herskovits, 1991, 1992; Buntine, 1991, 1994; Spiegelhalter, Dawid, Lauritzen, & Cowell, 1993; Madigan & Raftery, 1994; Heckerman et al., 1994, 1995). These methods take prior knowledge of a domain and statistical data, and construct one or more Bayesian-network models of the domain. Most of this work has concentrated on Bayesian networks interpreted as a representation of probabilistic conditional independence. Nonetheless, several researchers have proposed a causal interpretation for Bayesian networks (Pearl & Verma, 1991; Spirtes, Glymour, & Scheines, 1993; Heckerman & Shachter, 1994). These researchers show that having a causal interpretation can be important, because it allows us to predict the effects of interventions in a domain, something that cannot be done without a causal interpretation. In this chapter, we extend Bayesian methods for learning acausal Bayesian networks to causal Bayesian networks. We offer two contributions.
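To make the motivation concrete, the following sketch (not from the chapter; all numbers and variable names are made up for illustration) shows why the causal interpretation matters: the structures x → y and y → x can encode the identical joint distribution over two binary variables, yet they predict different effects of an intervention that forces x to a value.

```python
# Illustrative only: one joint distribution, two causal readings of it.
# Joint distribution over binary x, y, representable by both x -> y and y -> x.
joint = {(0, 0): 0.3, (0, 1): 0.1, (1, 0): 0.2, (1, 1): 0.4}

# Marginals computed from the joint.
px = {i: sum(p for (x, y), p in joint.items() if x == i) for i in (0, 1)}
py = {j: sum(p for (x, y), p in joint.items() if y == j) for j in (0, 1)}

# p(y | do(x=1)) if x causes y: intervening on x leaves p(y|x) intact,
# so the interventional distribution equals the conditional p(y | x=1).
p_y_do_x = {j: joint[(1, j)] / px[1] for j in (0, 1)}

# p(y | do(x=1)) if y causes x: forcing x has no effect on y,
# so the interventional distribution is just the marginal p(y).
p_y_do_x_rev = dict(py)
```

The two answers differ (2/3 versus 1/2 for y = 1 here), even though both networks agree on every observational probability; only the causal reading determines which answer is right.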
First, we show that acausal and causal Bayesian networks (or acausal and causal networks, for short) differ significantly in their semantics, and that it is inappropriate to blindly apply methods for learning acausal networks to causal networks. Second, despite these differences, we identify circumstances in which methods for learning acausal networks are applicable to learning causal networks.

In the Causal Networks Section, we describe a causal interpretation of Bayesian networks developed by Heckerman and Shachter (1994, 1995) that is consistent with Pearl's causal-theory interpretation (e.g., Pearl & Verma, 1991; Pearl, 1995). We show that any causal network can be represented as a special type of influence diagram. In the Learning Acausal Networks Section, we review Bayesian methods for learning acausal networks. We emphasize two common assumptions and one property for learning network structure that greatly simplify the learning task. One assumption is parameter independence, which says that the parameters associated with each node in an acausal network are independent. The other assumption is parameter modularity, which says that if a node has the same parents in two distinct networks, then the probability distributions for the parameters associated with this node are identical in both networks. The property is hypothesis equivalence, which says that two network structures that represent the same assertions of conditional independence correspond to the same random-sample assumption. In the Learning Influence Diagrams Section, we show how methods for learning acausal networks can be adapted to learn ordinary influence diagrams. In the Learning Causal-Network Parameters Section, we identify problems with this approach when learning influence diagrams that correspond to causal networks.
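The two assumptions above can be sketched in code. Under parameter independence, the marginal likelihood of the data given a structure factors into one term per node; under parameter modularity, a node with the same parents in two structures contributes the identical term to both, so terms can be cached and reused across structures. The sketch below (not from the chapter) uses the Cooper-Herskovits form of the local term with an illustrative uniform Dirichlet prior and toy data.

```python
import math
from itertools import product

def local_score(data, child, parents, arity):
    """Log marginal likelihood term for one node under parameter independence
    (Cooper-Herskovits form, uniform Dirichlet prior alpha_ijk = 1)."""
    score = 0.0
    for pstate in product(*[range(arity[p]) for p in parents]):
        rows = [r for r in data if all(r[p] == s for p, s in zip(parents, pstate))]
        counts = [sum(1 for r in rows if r[child] == k) for k in range(arity[child])]
        n_ij, r_i = sum(counts), arity[child]
        # log[ (r_i - 1)! / (N_ij + r_i - 1)!  *  prod_k N_ijk! ]
        score += math.lgamma(r_i) - math.lgamma(n_ij + r_i)
        score += sum(math.lgamma(c + 1) for c in counts)
    return score

def structure_score(data, structure, arity):
    # Parameter independence: the total score is a sum of per-node terms.
    # Parameter modularity: each term depends only on (node, parents), so it
    # is identical in any structure where the node keeps the same parents.
    return sum(local_score(data, x, pa, arity) for x, pa in structure.items())

# Toy binary data over two variables.
data = [{"x": 0, "y": 0}, {"x": 0, "y": 1}, {"x": 1, "y": 1}, {"x": 1, "y": 1}]
arity = {"x": 2, "y": 2}
s1 = structure_score(data, {"x": [], "y": ["x"]}, arity)  # structure x -> y
s2 = structure_score(data, {"x": ["y"], "y": []}, arity)  # structure y -> x
```

Note that with this uniform (K2-style) prior the two equivalent structures receive different scores (s1 ≠ s2), which previews the role of the likelihood-equivalence assumption discussed below.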
We identify two assumptions, called mechanism independence and component independence, that circumvent these problems. In the Learning Causal-Network Structure Section, we argue that the assumption of parameter modularity is also reasonable for learning causal networks. We also argue that, although the assumption of hypothesis equivalence is inappropriate for causal networks, we can often assume likelihood equivalence, which says that data cannot help to discriminate causal-network structures that are equivalent when interpreted as acausal networks. Given the assumptions of parameter independence, parameter modularity, likelihood equivalence, mechanism independence, and component independence, we show that methods for learning acausal networks can be used to learn causal networks.

We assume that the reader is familiar with the concept of a random sample, the distinction between subjective and objective probability (which we call probability and physical probability, respectively), and the distinction between chance and decision variables. (We sometimes refer to a decision variable simply as a "decision.") We consider the problem of modeling relationships in a domain consisting of chance variables U and decision variables D. We use lower-case letters to represent single variables and upper-case letters to represent sets of variables. We write x = k to denote that variable x is in state k. When we observe the state of every variable in a set X, we call this set of observations a state of X, and write X = k. Sometimes, we leave the state of a variable or a set of variables implicit. We use p(X = j | Y = k, ξ) to denote the (subjective) probability that X = j given Y = k for a person whose state of information is ξ; whereas, we use pp(X = j | Y = k) to denote the physical probability of this conditional event.

An influence diagram for the domain is a model for that domain having a structural component and a probabilistic component. The structure of an influence diagram is a directed acyclic graph containing (square) decision and (oval) chance nodes corresponding to decision and chance variables, respectively, as well as information and relevance arcs. Information arcs, which point to decision nodes, represent what is known at the time decisions are made. Relevance arcs, which point to chance nodes, represent (by their absence) assertions of conditional independence. Associated with each chance node x in an influence diagram are the probability distributions p(x | Pa(x), ξ), where Pa(x) are the parents of x in the diagram. These distributions, in combination with the assertions of conditional independence, determine the joint distributions p(U | D, ξ). A special kind of chance node is the deterministic node (depicted as a double oval). A node x is a deterministic node if its corresponding variable is a deterministic function of its parents. Also, an influence diagram may contain a single distinguished node, called a utility node, which encodes the decision maker's utility for each state of the node's parents. A utility node is a deterministic function of its predecessors and can have no children. Finally, for an influence diagram to be well formed, its decisions must be totally ordered by the influence-diagram structure (for more details, see Howard & Matheson, 1981).

An acausal Bayesian network is an influence diagram that contains no decision nodes (and, therefore, no information arcs). That is, an acausal Bayesian network represents only assertions of conditional independence (for more details, see Pearl, 1988).
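As a minimal sketch of this factorization (the variable names and CPT numbers below are illustrative, not from the chapter), an acausal Bayesian network over binary variables with structure x → y stores p(x | ξ) and p(y | x, ξ), and its missing arcs license computing the joint by the chain rule, p(U | ξ) = Π_x p(x | Pa(x), ξ):

```python
from itertools import product

# Hypothetical two-node network x -> y with illustrative distributions.
parents = {"x": [], "y": ["x"]}
cpt = {
    "x": {(): {0: 0.6, 1: 0.4}},        # p(x | xi)
    "y": {(0,): {0: 0.8, 1: 0.2},       # p(y | x=0, xi)
          (1,): {0: 0.3, 1: 0.7}},      # p(y | x=1, xi)
}

def joint_prob(assignment):
    """p(U | xi) via the chain rule, using the network's independencies."""
    p = 1.0
    for var, pa in parents.items():
        pa_state = tuple(assignment[q] for q in pa)
        p *= cpt[var][pa_state][assignment[var]]
    return p

# The per-node distributions determine the full joint, which sums to 1.
total = sum(joint_prob({"x": x, "y": y}) for x, y in product((0, 1), repeat=2))
```

For instance, joint_prob({"x": 0, "y": 1}) multiplies p(x=0 | ξ) by p(y=1 | x=0, ξ), exactly the two local distributions attached to the nodes.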



Publication date: 2006